Large Scale Distributed Deep Networks
Authors
Abstract
Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training: (i) Downpour SGD, an asynchronous stochastic gradient descent procedure supporting a large number of model replicas, and (ii) Sandblaster, a framework that supports a variety of distributed batch optimization procedures, including a distributed implementation of L-BFGS. Downpour SGD and Sandblaster L-BFGS both increase the scale and speed of deep network training. We have successfully used our system to train a deep network 30x larger than previously reported in the literature, achieving state-of-the-art performance on ImageNet, a visual object recognition task with 16 million images and 21k categories. We show that these same techniques dramatically accelerate the training of a more modestly sized deep network for a commercial speech recognition service. Although we focus on and report performance of these methods as applied to training large neural networks, the underlying algorithms are applicable to any gradient-based machine learning algorithm.
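The core of Downpour SGD, as described above, is a central parameter server paired with many asynchronous model replicas, each computing gradients on its own shard of the data and pushing them back without global synchronization. The Python sketch below illustrates that fetch-compute-push loop under simplifying assumptions: threads stand in for separate machines, the model is a small linear regression with squared-error loss, and all hyperparameters are chosen purely for demonstration. It is a minimal illustration of the idea, not the DistBelief implementation.

# Minimal sketch of the Downpour-SGD idea (assumptions: threads instead of
# machines, a linear model, squared-error loss, illustrative hyperparameters).
import threading
import numpy as np


class ParameterServer:
    """Holds the shared parameters; updates arrive with no global ordering."""

    def __init__(self, dim, lr=0.05):
        self.w = np.zeros(dim)
        self.lr = lr
        self.lock = threading.Lock()  # keeps each individual update atomic

    def fetch(self):
        with self.lock:
            return self.w.copy()

    def push(self, grad):
        with self.lock:
            self.w -= self.lr * grad


def replica(server, X_shard, y_shard, steps, batch_size=32):
    """One model replica: fetch parameters, compute a mini-batch gradient of
    the mean squared error for a linear model, and push it back."""
    rng = np.random.default_rng()
    for _ in range(steps):
        w = server.fetch()                           # possibly stale parameters
        idx = rng.choice(len(X_shard), size=batch_size)
        X, y = X_shard[idx], y_shard[idx]
        grad = 2.0 * X.T @ (X @ w - y) / batch_size  # gradient of the MSE loss
        server.push(grad)                            # asynchronous update


if __name__ == "__main__":
    # Synthetic data y = X @ w_true + noise, purely for demonstration.
    rng = np.random.default_rng(0)
    w_true = np.array([3.0, -2.0, 0.5])
    X = rng.normal(size=(4000, 3))
    y = X @ w_true + 0.01 * rng.normal(size=4000)

    server = ParameterServer(dim=3)
    shards = np.array_split(np.arange(len(X)), 4)    # one data shard per replica
    threads = [
        threading.Thread(target=replica, args=(server, X[s], y[s], 200))
        for s in shards
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("learned parameters:", server.fetch())     # should approach w_true

Because replicas may compute gradients against stale parameters, the updates are not equivalent to sequential SGD; the paper's observation is that this relaxed consistency still works well in practice while removing the synchronization bottleneck.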
Similar resources
DeepSpark: Spark-Based Deep Learning Supporting Asynchronous Updates and Caffe Compatibility
The increasing complexity of deep neural networks (DNNs) has made it challenging to exploit existing large-scale data-processing pipelines for handling the massive data and parameters involved in DNN training. Distributed computing platforms and GPGPU-based acceleration provide a mainstream solution to this computational challenge. In this paper, we propose DeepSpark, a distributed and parallel deep l...
Learning Deep Representations By Distributed Random Samplings
In this paper, we propose an extremely simple deep model for unsupervised nonlinear dimensionality reduction – deep distributed random samplings. First, its network structure is novel: each layer of the network is a group of mutually independent k-centers clusterings. Second, its learning method is extremely simple: the k centers of each clustering are only k randomly selected examples from...
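As a rough illustration of the layer construction described in this abstract, the Python fragment below builds one layer as a group of mutually independent k-centers clusterings whose centers are k randomly chosen input examples. The one-hot nearest-center encoding used as the layer output, and all sizes, are assumptions made here for illustration rather than details taken from the paper.

# Sketch of one "distributed random samplings" layer (the nearest-center
# one-hot encoding and all sizes are illustrative assumptions).
import numpy as np


def random_sampling_layer(X, n_clusterings=5, k=10, rng=None):
    """Build one layer from a group of independent k-centers clusterings and
    return its representation of X."""
    if rng is None:
        rng = np.random.default_rng()
    outputs = []
    for _ in range(n_clusterings):
        centers = X[rng.choice(len(X), size=k, replace=False)]   # k random examples as centers
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)                            # closest center per example
        outputs.append(np.eye(k)[nearest])                        # indicator encoding
    return np.concatenate(outputs, axis=1)                        # layer output


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 30))
    h = X
    for _ in range(3):                                            # stack a few layers
        h = random_sampling_layer(h, n_clusterings=5, k=10, rng=rng)
    print("representation shape:", h.shape)                       # (200, 50)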
A multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images
The reconstruction of information contaminated by cloud and cloud shadow is an important step in the pre-processing of high-resolution satellite images. Automatic segmentation of cloud and cloud shadow can be the first step in the process of reconstructing the contaminated information. This stage is a remarkable challenge due to the relatively inefficient performanc...
Partitioning Large Scale Deep Belief Networks Using Dropout
Deep learning methods have shown great promise in many practical applications, ranging from speech recognition and visual object recognition to text processing. However, most current deep learning methods suffer from scalability problems in large-scale applications, forcing researchers or users to focus on small-scale problems with fewer parameters. In this paper, we consider a well-known ...
A New Load-Flow Method in Distribution Networks based on an Approximation Voltage-Dependent Load model in Extensive Presence of Distributed Generation Sources
Power-flow (PF) solution is a basic and powerful tool in power system analysis. Distribution networks (DNs), compared to transmission systems, have many fundamental distinctions that make the conventional PF ineffective on these networks. This paper presents a new, fast, and efficient PF method that covers all the different models of Distributed Generation (DG) sources and their operational modes...
Publication date: 2012